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Abstract 

Reliable  derivatives  of  digital  images  have  always  been  hard  to  obtain,  especially  (but 
not  only)  at  high  orders.  We  present  new  filters  that  give  more  accurate  derivatives  than 
the  traditional  Gaussian  ones.  We  show  that  the  traditional  filters  give  incorrect  derivatives 
even  for  an  analytic,  noiseless,  infinite  image,  because  they  smooth  the  image  too  much.  For 
a  finite  interval,  the  effects  of  truncating  the  filter  become  intolerable  for  high  derivatives. 
We  derive  filters  that  allow  a  higher  amount  of  noise  suppression  with  less  compromise  of 
accuracy  than  the  Gaussian.  The  filters  are  easy  to  compute  at  arbitrary  size.  In  addition, 
a  general  analytic  (non-filter)  solution  is  derived  for  the  regularization  problem  on  a  finite 
interval. 
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1.  Introduction 


Finding  derivatives  in  an  image  has  always  been  a  desirable  goal  in  computer  vision  because 
it  is  often  the  local  changes  in  the  scene  that  characterize  the  shape,  for  instance  its 
curvature,  which  is  seen  as  a  change  in  shading  or  in  other  image  characteristics.  Curvature 
itself  involves  second  derivatives,  and  finding  maxima  of  curvature  requires  third  ones.  In 
some  applications,  for  instance  in  finding  affine  and  projective  invariants  of  shapes  [Weiss. 
19S8,  1991],  even  higher  derivatives  are  required.  Thus,  having  reliable  derivatives  makes 
it  possible  to  apply  the  vast  body  of  knowledge  that  exists  about  differential  mathematical 
methods. 

The  usual  methods  for  finding  derivatives  have  been  very  unreliable,  with  the  problems 
growing  unacceptably  worse  with  higher  derivatives.  This  has  severly  limited  the  usefulness 
of  most  methods  in  vision  that  rely  on  derivatives  for  their  application.  A  tendency  has 
developed  in  the  field  to  avoid  the  problem  altogether  and  try  to  find  alternative  methods, 
but  the  need  for  derivatives  has  not  simply  gone  away.  Particularly  for  smooth  shapes 
without  sharp  edges,  there  is  no  good  substitute  for  differential  methods.  We  are  not 
dealing  here  with  discontinuities.  Rather,  we  are  interested  in  higher  derivatives  of  smooth 
shapes.  However,  our  method  can  pinpoint  the  location  of  zero  crossings  in  a  smoothed 
image  with  accuracy,  whereas  a  simple  Gaussian  smoothing  tends  to  dislocate  them.  This 
can  be  important  in  recognizing  shapes  with  scale  space  methods. 

In  this  paper  we  analyze  the  sources  of  errors  in  common  methods  for  differentiation 
and  show  how  to  correct  them.  We  start  with  some  basic  requirements  that  we  would  like 
differentiation  operators  to  satisfy  and  show  how  to  build  them  while  avoiding  the  usual 
problems.  Two  basic,  and  conflicting,  requirements  are  involved:  a)  Accuracy:  At  least 
for  smooth,  noiseless  image,  one  would  like  to  obtain  the  correct  derivatives,  at  least  at 
low  orders,  b)  Smoothing,  to  alleviate  the  effect  of  noise  and  discretization.  Generally, 
smoothing  reduces  the  accuracy  even  in  the  analytic  case  as  the  more  rapid  changes  in 
the  function  are  smoothed  out.  The  goal  in  designing  a  derivative  filter  is  then  to  strike 
the  correct  balance  between  accuracy  and  smoothing.  In  this  paper  we  develop  filters  that 
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yield  correct  low  order  derivatives  (up  to  a  desired  order),  while  the  higher  derivatives 
are  reduced  in  a  controlled  way  by  the  smoothing  parameter.  This  combats  noise  because 
noise  changes  considerably  from  pixel  to  pixel  and  thus  contributes  mainly  to  the  high  order 
variations,  while  it  tends  to  average  out  in  the  larger  scale,  so  it  does  not  add  much  to  the 
slower  variations  in  the  image.  The  method  is  based  on  a  modification  of  the  regularization 
method  combined  with  a  spline  approximation.  The  results  are  rather  simple  closed-form 
expressions  for  the  filters. 

Most  existing  methods  are  based  on  smoothing  the  shape  with  a  Gaussian  filter  and 
then  taking  the  derivatives  of  the  smoothed  image.  This  is  equivalent  to  filtering  with 
a  derivative  of  the  Gaussian.  We  show  that  this  method  gives  the  wrong  results  even 
for  the  low  order  derivative  of  a  simple  analytic  polynomial.  The  smoothing  effect  is  too 
strong  relative  to  the  accuracy  needed  for  most  applications.  In  addition,  truncating  the 
Gaussian  at  finite  boundaries  usually  adds  avoidable  errors,  which  are  quite  benign  for 
smoothing  but  can  be  particularly  damaging  for  high  derivatives  and  lead  to  meaningless 
results  even  in  noiseless  images.  This  is  because  truncation  is  equivalent  to  introducing  a 
sharp  discontinuity  in  the  data  which  makes  the  higher  derivatives  meaningless.  It  turns 
out  that  this  problem  can  be  solved  simply  by  replacing  the  derivative  of  the  Gaussian  (or 
other  smoothing  function)  by  the  central  difference  of  the  same  order.  This  amounts  to  a 
spline  approximation  of  the  same  order.  A  more  complicated  approach  based  on  a  general 
analytic  solution  of  the  regularization  equation  is  also  presented. 

Other  methods  that  have  been  used  can  be  divided  into  several  classes:  1)  Methods 
that  produce  small  but  easy  to  calculate  filters  (Hueckel,  [1973],  Dierckx,  [1977],  Hummel. 
[1979],  Haralick,  [1984].  Besl,  [1988]).  They  are  usually  quite  ad  hoc  and  the  results  are 
not  very  reliable.  2)  Methods  based  on  global  regularization,  that  assign  an  unknown 
variable  to  each  pixel  and  solve  a  large  system  of  equations  for  the  whole  set  of  pixels  in 
the  image  (Crimson,  [1981],  Horn,  [19S3],  Poggio,  [1987],  Kass,  Witkin,  and  Terzopoulos. 
[19S7],  Blake  and  Zisserman.  [19S7],  Poggio  et  al.  [1988].  These  methods,  being  based 
on  sound  principles,  are  rather  robust  but  they  are  computationally  intensive,  and  the 


accuracy  of  their  derivatives  has  not  been  investigated.  3)  Method  occupying  the  middle 
ground  between  the  two  extremes.  Meer  and  Weiss  [1989]  used  polynomial-based  filters 
that  smooth  the  image  over  a  given  size  window  which  is  moved  over  the  image.  The  least 
squares  distance  is  found  between  the  image  and  a  finite  series  of  orthogonal  polynomials; 
the  degree  of  smoothing  is  determined  by  the  highest  order  of  polynomial  used.  Similar 
filters  were  discussed  by  [Hashimoto  and  Sklansky,  1987].  While  these  filters  work  well  in 
many  ;ases,  the  high  derivative  filters  tend  to  concentrate  in  the  center  of  the  window, 
reducing  the  smoothing.  Weiss  [19S9]  introduced  a  regularization  based  method,  again 
for  use  in  moving  finite  windows.  While  the  amount  of  calculation  needed  is  orders  of 
magnitude  less  than  in  traditional  regularization,  it  is  still  more  complicated  than  one 
would  like.  The  present  method  is  based  on  a  continuous  approximation  so  it  is  much 
easier  to  derive  the  filters.  One  can  choose  the  window  size  and  the  smoothing  parameter 
and  then  easily  calculate  the  differentiation  filter. 

In  the  next  section  a  general  criterion  for  differentiation  filters  will  be  developed,  in 
subsequent  sections  we  will  then  derive  smoothing  filters  for  a  finite  window,  deal  with 
differentiation,  and  show  experimental  results. 

2.  Smoothing  Versus  Accuracy 

In  this  section  we  analyze  the  balance  between  smoothing  and  accuracy  for  a  general  filter, 
and  derive  a  general  “accuracy  criterion"  that  a  good  filter  should  satisfy. 

As  an  illustration  we  first  show  that  the  Gaussian  gives  the  wrong  result  even  for  the 
simplest  functions.  Smoothing  over  x2  gives 

g(x,a)  0  x-2  =  x2  +  a2 

where  g{x.rr)  is  the  Gaussian,  and  3  denotes  a  convolution.  Smoothing  x3  gives 

g(x.  a)  3  x3  -  x3  -I-  3ct2x 

We  can  see  that  an  error  is  introduced  that  increases  as  we  increase  the  smoothing,  i.e. 
higher  a.  Similar  results  are  obtained  for  higher  powers  or  for  taking  derivatives.  For 
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effective  smoothing  a  should  not  be  too  small  so  this  error  can  be  substantial.  This  is 
a  systematic,  "over smoothing  error  ”,  one  in  the  larger  scale,  as  opposed  the  small  scale 
random  noise  that  we  want  to  smooth.  As  noted  before,  we  do  not  expect  a  smoothing 
operator  to  give  accurate  results  but  we  do  want  to  improve  the  balance  between  accuracy 
and  smoothing  so  that  at  least  the  leading  terms  in  a  Taylor  expansion  will  not  produce 
errors. 

We  will  now  discuss  this  noise  versus  signal  problem  in  terms  of  the  ratio  between 
the  smoothing  parameter  cr,  and  a  "natural”  scale  of  the  shape,  s0-  For  a  smooth  shape 
f(x),  we  estimate  this  so  as  the  scale  over  which  the  relative  change  in  the  shape.  A///, 
is  of  order  of  magnitude  of  1.  We  than  rescale  the  x  axis  so  that  Ji  =  x/s0,  and  rewrite 
the  shape  as  /(xj).  For  example,  if  /(x)  =  sin(x/s0)  than  /(xj)  =  sin(xi).  This  makes 
the  derivatives  of  /  with  respect  to  xj  of  the  order  of  magnitude  of  1.  In  this  way  we 
“normalize”  the  signal  in  the  x  direction  and  deal  with  its  scale  separately. 

Intuitively  speaking,  when  smoothing  the  image  of  this  shape,  we  would  like  to  main¬ 
tain  the  accuracy  of  changes  on  scale  s0  and  longer  scales,  but  smooth  the  smaller  features 
in  the  image.  This  is  because  the  shape  is  assumed  to  be  smooth  on  the  shorter  scales  and 
any  changes  there  is  probably  noise.  In  the  following  we  will  quantify  this  concept. 

2.1  Accuracy  Criterion.  A  natural  tool  in  dealing  with  derivative  in  some  neigh¬ 
borhood  of  a  smooth  shape  is  the  Taylor  expansion 


-  X  ■r — »  X 

/(*)  =  /(-)  =  y'i-r(-r 

So  '  v\  So 


We  will  look  at  the  result  of  Gaussian  filtering  at  x  =  0.  It  is  easy  to  show  that  the  error 
introduced  by  this  smoothing  is 


So  O  So 

with  the  derivatives  /* " 1  =  ~  ^(1)-  ^  we  waut  accurate  results,  we  have  to  keep 

this  error  small.  Looking  at  the  leading  term,  we  see  that  the  accuracy  is  proportional  to 
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a2  j  s o,  so  vre  have  to  keep  a  small,  i.e.  a  <<  s0.  Unfortunately,  this  limits  the  ability  of 
the  filter  to  smooth  out  the  noise. 

For  derivatives  we  similarly  have  (with  </***  ©  /  =  g  ©  f(k)) 


Ak) 


f-J 


(W 


*0  L 


sq  o  sq 


Again  the  error  resulting  from  the  leading  term  is  proportional  to  a2 / $q.  (For  the  relative 
error  we  can  divide  this  by  the  true  derivative,  =  /^V*'o-) 

A  way  to  improve  the  situation  is  to  eliminate  the  leading  terms  in  the  expansions  of 
the  errors  above.  If  the  first  term  is  eliminated,  for  instance,  than  the  error  will  be  reduced 
to  as  |(  This  way  we  obtain  a  much  better  accuracy  for  the  same  smoothing  param¬ 
eter  a  as  before,  or  alternatively,  we  can  increase  the  smoothing  without  compromising 
accuracy. 

To  generalize  this  example,  we  want  a  filter  Ft  that  will  preserve  the  powers  xn : 


F/  0  xn  =  xn,  n  =  0  ...  I 


(For  the  Gaussian  /  =  1.)  These  conditions  can  be  expressed  more  simply  in  terms  of  of 
the  filter's  “normalized  moments'’  mn: 

mn  =  J (  ^ )nFi(x)dx 

where  o  is  now  a  measure  of  the  filter  size,  e.g.  the  variance.  Using  the  binomial  expansion 
we  have 

Fi  ®xn  =  j  F,(x  + 

=  J  Fi( s,  -  x)[(£  -  x)n  4-  n(£  -  x)n~lx  +  . . .  +  xn]d(£  -  x) 

—  mnan  nmn-\(7n~x x  +  . . .  +  m0xn 

We  see  that  the  preservation  of  powers  xn  by  the  convolution  is  equivalent  to  the  conditions 
on  the  moments: 

rtio  —  1:  mn  —  0,  n  =  ]  ...  I 
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The  error  in  the  smoothing  of  /G9  at  x  =  0  can  now  be  written  as 


F,  ©  f(k)  —  f(k)  *  1  n‘\\:fii+k+i)(-),+  i  + 
(/  +  1  j!  so 


Thus,  increasing  the  order  /  of  the  filter  eliminate  the  powers  (  )n  up  to  n  =  /  and  improve 
the  accuracy  to  the  above  tolerance. 

The  error  expression  above  gives  us  an  “accuracy  criterion”  for  choosing  appropriate 
parameters.  We  need  its  leading  term  to  be  small,  and  since  /*"*  %  0{  1)  we  obtain  the 
condition 


mi+ 1 
(/  +  !)■' 


(1) 


This  criterion  can  be  used  to  estimate  the  parameters  in  several  ways.  Given  the  meaningful 
scale  of  change  of  the  signal,  so,  and  the  smoothing  parameter  <r,  we  can  calculate  the  order 
/  of  the  filter  that  will  lower  the  oversmoothing  error  to  some  acceptable  level,  a  can  be 
determined  from  the  estimated  noise,  as  discussed  below.  Conversely,  for  a  given  order 
/,  we  can  calculate  the  largest  smoothing  parameter  a  that  will  still  maintain  a  desired 
accuracy.  Generally  speaking,  at  low  orders  we  need  a  <  so,  but  at  high  orders  we  can 
afford  to  have  the  smoothing  a  bigger  than  s0  because  the  factor  mi+\/(l  +  1)!  in  (1)  is 
small. 

One  way  of  obtaining  such  xn-preserving  filters  is  by  multiplying  a  Gaussian  with 
appropriate  polynomials.  The  filter  of  order  /,  eliminating  errors  up  order  /,  is  then  (on  an 
infinite  interval) 

l 

Ft  =  a‘F,(x)g(x) 

i=0 


where  P,{x)  are  Hermite  polynomials  which  are  orthogonal  with  respect  to  the  Gaussian 
weight  function.  The  coefficients  a,  are  chosen  so  that  the  first  l  powers  xn  are  preserved 
by  the  filter.  Discrete  versions  of  this  method  on  a  finite  window  are  described  in  detail  in 
[Meer  and  Weiss,  19S9].  A  different  way  to  achieve  this  goal  is  described  here,  which  has 
advantages  with  respect  to  handling  noise. 
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2.2  Noise  Suppression.  Maintaining  accuracy  is  only  one  side  of  the  balance  we 
are  interested  in.  The  other  side  is  suppressing  noise.  We  call  noise  any  random  variation 
including  “bumps”  in  the  shape,  discretization  noise,  etc.  The  noise  is  not  polynomial  and 
can  better  be  described  by  a  Fourier  representation,  N(u).  The  derivative  of  this  noise 
contributes  the  the  error  in  finding  derivative  of  the  shape.  After  smoothing  the  shape  by 
the  filter  Ft  the  noise  derivative  in  the  Fourier  domain  is 


=  uJk  Fi{auj)N(u)) 

with  F  being  the  Fourier  transform  of  F.  We  can  write  this  error  in  terms  of  scale  of 
change  s: 

_  ^ 

u> 

For  the  signal  to  noise  ratio  of  the  derivative  we  can  divide  this  error  by  the  derivative 
itself,  ft-k)  =  /(t)/4,  to  obtain 

s  s  s 

Since  we  assume  that  the  noise  N  vanishes  above  the  scale  s  >  so,  (i.e.  any  change  above 
this  scale  is  part  of  the  signal),  we  want  the  filter  to  do  no  smoothing  there.  However,  for 
s  <  So  we  see  that  this  error  is  proportional  F/sk  and  smoothing  becomes  critical.  Without 
smoothing  (F  =  1),  it  will  tend  to  infinity  as  the  size  of  the  bumps  s  tends  to  zero.  (This 
can  also  be  seen  by  direct  differentiation  of  N).  The  effectiveness  of  the  smoothing  is  thus 
determined  by  the  factor  Ft(a/s),  so  we  need  a  filter  that  decays  rapidly  in  the  Fourier 
domain  as  a  function  of  a/s.  This  implies  a  requirement  for  large  a.  Generally  speaking, 
a  filter  will  smooth  noise  bumps  whose  size  and  height  are  small  compared  with  a.  We 
thus  want  a  to  be  large  enough  to  make  the  noise  to  signal  ratio  above  small,  but  not  large 
enough  to  violate  the  accuracy  criterion  (1).  This  criterion  shows  that  we  can  increase  the 
smoothing  parameter  with  little  compromise  of  accuracy  by  using  higher  order  filters. 

For  the  Gaussian,  this  smoothing  factor  is  1 3  ,  so  it  will  suppress  any  derivative  of 
the  noise  with  scale  smaller  than  a.  However,  it  also  suppresses  the  changes  on  the  larger 


scales  s  >  so  which  are  not  noise,  and  this  is  another  expression  of  the  oversmoothing 
problem  discussed  above.  (In  terms  of  the  previous  analysis,  the  accuracy  criterion  will  be 
met  only  with  a  very  small  u).  We  need  a  filter  that  will  not  affect  the  changes  on  scale 
s  >  so  but  will  suppress  the  changes  on  scales  s  <  So-  In  this  paper  we  will  derive  filters, 
based  on  regularization  theory,  which  meet  these  requirements. 

The  regularization  filter  for  the  infinite  line  is  a  Butterworth-like  filter:  (Sec.  3) 

Fi  =  - - - 

1  +  (i7  /  S)l+X 

(with  l  being  the  order  of  the  filter).  This  is  in  fact  a  good  low  pass  filter  ( Fig.  3).  preserving 
the  slow  changes  and  suppressing  the  faster  changes  in  derivatives  up  to  order  /.  Here  we 
will  modify  this  filter  for  a  finite  interval.  This  is  important  for  finding  derivatives,  which 
are  of  a  local  nature.  In  the  case  of  the  Hermite  and  other  polynomial  filters.  F/  start  to 
decay  at  a  certain  point  but  then  they  oscillate  for  a  long  interval,  so  they  can  have  more 
difficulty  dealing  with  smoothing  noisy  derivatives. 

In  summary,  we  have  seen  some  sources  of  errors  in  differentiation,  and  found  the 
requirements  that  a  good  filter  has  to  meet.  Using  both  Taylor  expansion  analysis  and 
Fourier  analysis  we  have  concluded:  (i)  the  Gaussian  filter  oversmooths  the  signal,  (ii) 
otker  filters  can  be  derived  which  do  not  have  this  problem,  and  (iii)  an  appropriate  choice 
of  the  parameters  of  these  filters  is  the  key  to  obtaining  a  good  signal  to  noise  ratio  in 
the  derivatives.  Some  general  guidelines  were  found  (the  accuracy  criterion  (1))  for  this 
choice,  with  the  details  depending  on  the  specific  filter. 

The  above  conclusions  are  valid  for  both  finite  and  infinite  intervals.  However,  the 
finite  case  can  add  other  serious  errors  that  we  deal  with  later. 
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3.  Regularized  Differentiation  on  the  Infinite  Line 

As  an  introduction  to  our  method  we  treat  here  an  infinite  interval.  The  next  section 
treats  the  finite  interval  case,  more  suitable  for  derivatives.  In  the  previous  section  we 
described  the  need  to  improve  the  balance  between  smoothing  and  accuracy  provided  bv 
the  Gaussian.  A  method  that  allows  us  to  adjust  this  balance  to  fit  our  needs  is  the 
regularization  method.  It  basically  consists  of  minimizing  a  cost  function  that  contains 
both  an  accuracy  term  and  a  smoothing  term.  The  cost  functional  can  be  written  as 

J  ((f(x)  -  p(  x )  )2  +  A/"  (  x  ) 2  )ilx 

where  p(x)  is  the  image  data  and  /( x  l  is  the  smoothed  shape.  The  first  term  is  a  measure 
of  the  square  distance  of  the  smoothed  shape  from  the  data,  while  the  second  represents 
the  variation,  or  unsmoothness,  of  the  shape.  The  parameter  A  determines  the  balance 
between  smoothing  and  accuracy.  In  the  continuous  case,  it  is  possible  to  minimize  this 
function  by  using  the  Euler- Lagrange  equation  of  the  calculus  of  variations,  which  leads 
to  the  following  differential  equation  for  the  shape  / 

f(x)  +  Xf""(x)-p(x)  i  2  l 

This  equation  earn  be  solved  instantly  for  th*.  simple  d^ta  functions  p  =  xn ,  n  <  4.  In 
this  case  /  =  p  (because  the  fourth  derivative  vanishes).  That  is.  the  smoothed  shape 
in  equal  to  the  data  image  in  this  case,  unlike  the  results  of  Gaussian  smoothing  that  we 
saw  earlier.  Thus,  with  a  similar  degree  of  smoothing,  the  regularization  method  produces 
more  accurate  values  of  the  first  powers  xn  than  the  Gaussian. 

The  next  step  is  to  produce  a  general  purpose  filter  based  on  the  above  method,  so 
that  we  will  not  have  to  solve  a  fourth-order  differential  equation  for  every  image.  To  do 
this  we  use  the  Green  function  method  of  differential  equations.  We  solve  for  a  "needle" - 
like  image,  i.e.  a  delta  function  d{x  —  4)-  wheie  4  is  any  given  point  in  the  imasre.  The 
solution  is  our  Green  function.  Because  of  the  linearity  of  the  problem,  a  general  solution 
can  be  built  from  such  basis  solutions. 


In  a  physical  analogy,  our  cost  function  represents  a  flexible  "snake"  with  the  smoot h- 
.  sc  parameter  corresponding  to  the  stiffness  of  the  material.  The  snake  can  be  made  up 
of  a  combination  of  "basis  '  snakes.  Each  basis  snake  is  a  smooth  version  of  one  data  point, 
or  a  “needle"  image.  Since  the  the  problem  is  linear,  a  combination  of  such  snakes  will 
give  the  appropriate  smooth  shape. 

More  formally,  denoting  the  Green  function  by  G.  we  want  it  to  solve 

G(.r  -  f )  +  AG’"" (x  -  £)  =  d(j '  -  0  (  3  i 

The  solution  for  general  data  is  now 

f(  x )  =  J  G(  x  -  £  )p(  £  )</£  i  4  ) 

The  above  solution  is  in  fact  a  convolution  of  the  image  data  p  with  a  filter,  the  Green 
function  G. 

The  simplest  way  to  find  this  Green  function  is  via  a  Fourier  transform.  Transforming 
both  sides  of  eq.  (3)  we  have,  with  G'(-c)  being  the  transform  of  G(x) 

G  ( w  )  4-  A*c 1 G  f  .c  )  =  1 


from  which  G’(*c')  is 


G  ( )  — 


1 


1  +  Aw4 

Transforming  back  to  the  coordinate  domain  we  have  to  perform  the  integral 


1  [*>  e1-" 

The  derivatives  G(n,(.r)  can  be  found  directly  or  from 


L 


G  (.n 


0"G(x)  1  in~nf‘~I 

-  Ox ”  ~  2^  J_  ^  1  +  A*- 
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Tht‘  above  integrations  can  be  performed  using  the  residue  method  of  analytic  functions. 
As  expected  from  the  theory  of  differential  equations,  the  result  is  a  function  with  one 
discontinuity,  at  x  —  0: 

G'*Hx)=lGl:'lX)  f°rX-° 

\  (  —  1  )nG{"  ■’(  — x )  for  x  <  0 

For  n  =  0.  the  smoothing  filter,  we  have  G  ( x )  =  G'+ll-fj)  ’vith 

1 

Cf-  =  G^’lx)  =  — e_r^<Ti'sin(x/'7 )  4- cos(.r/tr ) ) 

2  a 

where 

a  =  I  A/4) 1/4 

We  will  also  need  the  integral  of  G: 

G'!f~1,(x)  =  f  G{x)  =  ^[1  -  e~z/lT  cos(i/cr)] 

Jo 

The  smoothing  filter  G  is  plotted  in  Fig.  1  along  with  a  Gaussian  with  the  same 
a.  The  first  derivative  filter  G"  is  plotted  in  Fig.  2  with  the  derivative  of  the  Gaussian. 
Figs.  3.4  compare  the  Fourier  transforms  of  the  smoothing  and  differentiation  filters, 
respectively.  We  can  see  from  the  transform  of  the  smoothing  filter  G  that  it  acts  like 
a  low-pass  filter,  with  the  low  frequency  response  being  almost  fiat,  meaning  that  the 
slower,  larger-scale  variations  in  the  data  are  preserved  much  better  than  they  are  by  the 
Gaussian.  Smoothing  begins  only  above  a  certain  frequency,  depending  on  the  smoothing 
parameter,  and  increases  for  higher  frequencies.  This  is  in  line  with  the  earlier  observation 
about  the  preservation  of  the  low  order  Taylor  terms  and  with  our  initial  requirements  on 
a  differentiation  operator. 

It  can  be  verified  that  the  powers  .r".  n  <  4  are  indeed  eigenvalues  of  the  convolution 
with  this  filter: 
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in  line  with  our  earlier  observation  about  the  preservation  of  powers  xn . 


The  same  method  can  be  carried  to  higher  orders,  so  that  eq.  (5)  will  hold  for  higher 
powers  n.  As  discussed  in  Section  2,  a  method  that  will  preserve  higher  powers  of  x  will 
allow  using  a  stronger  smoothing  factor  a.  To  obtain  that,  we  replace  /"  in  the  cost 
function  by  .  This  results  in  the  following  Gk: 


1  r 00 

Gi(I)  =  s/_i 


<x>  etuJX 


+  Xu1*' 


where  /  =  2k  —  1  is  the  order  of  the  filter,  which  will  preserve  powers  up  to  xl . 
The  filter  with  k  =  3  (or  /  =  5)  is,  with  a  =  A  2* . 

£3,+  =  —  [e-1/*7!  \/3  sin(  VZx/cr)  +  cos(  \Z3x/cr))  +  e~2z^a\ 

and  for  a.  =  4,  (or  l  =  7)  with 


Cl  =  \A2  +  c2  =  V(2  ~  ^)/2 


we  have 

G 4,+  =  —  [e-C2r/,<T(c2  cos(cix/a)  -f  c\  s\n\c\x / a)) 

4(7 

4- e~ClX^{ci  cos( C2X / <7 )  +  C2  sin(c9x/cr))] 

The  infinite  k  =  2  case  was  also  derived  by  [Poggio  et  al.,  1985]  without  noting  the 
connection  with  preservation  of  powers.  Here  we  shall  deal  with  high  order  derivatives  of 
a  smooth  shape  in  a  finite  neighborhood,  and  this  is  developed  next. 

4.  The  Finite  Window  Case:  Preserving  Powers 

In  practice  one  wants  to  use  a  finite  size  window,  i.e.  convolve  the  image  with  a  finite  filter 
not  only  because  of  physical  limitations  but  because  one  does  not  want  to  smooth  over 
discontinuities,  so  the  size  of  the  filter  has  to  be  smaller  than  the  size  of  the  smooth  parts 
of  the  shape.  Derivatives,  in  particular,  are  local  phenomena  and  call  for  a  local  filter. 

One  is  tempted  to  simply  truncate  an  infinite  filter  at  some  finite  point,  but  this  is 
inappropriate  for  two  reasons:  (i)  truncation  amounts  to  introducing  a  discontinuity  in  the 
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function  with  disastrous  effects  on  higher  derivatives.  This  will  be  dealt  with  later,  (ii) 
the  property  of  preservation  of  powers  takes  a  different  form  on  a  finite  interval  and  the 
Green  function  has  to  be  modified.  This  will  be  done  now. 

To  maintain  the  accuracy  of  the  filter  we  have  to  maintain  the  property  that  the  filter 
will  preserve  the  lower  powers  t"  ona  finite  interval,  and  for  that  we  have  to  deal  in  more 
detail  with  the  Green  function  method.  This  property  will  translate  into  linear  conditions 
on  the  parameters  of  the  Green  function. 

A  general  solution  of  a  linear  differential  equation  like  (3)  has  two  kinds  of  solutions: 
a  homogeneous  solution,  which  solves  the  left  hand  side  (homogeneous)  part  of  it  alone 
(with  zero  r.h.s.).  and  an  inhomogeneous  part,  that  makes  the  left  hand  side  equal  to  the 
right  hand  side.  For  our  4-th  order  equation  one  can  express  this  as 

4 

G(x.£)  =  Gooix  -  0  +  b,(Z)Ht(x  -  x0)  (6) 

1=1 

G0 c  is  the  solution  in  the  infinite  case  found  before  and  it  provides  the  6  function  at  x  =  £ 
(i.e.  our  "needle'’  is  at  <f), 


(1  +  A54)Goc(x-0  =  <5(J-0 

This  is  our  inhomogeneous  solution.  The  homogeneous  solutions  b,  H,  (with  b ,  being  ar¬ 
bitrary  coefficients  and  H,  some  basis  homogeneous  solutions)  only  solve  the  l.h.s.  The 
origin  xo  can  be  chosen  arbitrarily,  and  without  loss  of  generality  it  can  be  chosen  as 
zero,  but  here  we  will  need  another  choice.  The  general  solution  is  thus  not  unique  and  is 
determined  by  boundary  conditions. 

The  infinite  case  was  special  in  the  sense  that  the  homogeneous  solution  need  not  be 
considered,  because  the  boundary  conditions  require  that  the  solution  and  its  derivatives 
vanish  at  infinity  and  only  a  trivial  (zero)  homogeneous  solution  has  this  property.  Thus. 
Go c  in  that  case  is  a  unique  solution  of  eq.  (3)  and  Gog  Zp  is  the  unique  solution  for  general 
data  p( x ).  So.  with  the  data  p(x)  =  xn  (n  <  4).  the  unique  solution  is  /(x)  =  x  ‘  and  it 
is  give  by  this  convolution.  In  the  finite  boundary  case,  there  are  many  possible  solutions 
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and  the  one  given  by  G^o  ®  p  is  x4  "contaminated”  by  the  addition  of  some  homogeneous 
solution. 


Our  goal  is  now  to  find  the  coefficients  6,  in  the  homogeneous  solution  in  (6)  that  will 
restore  the  desirable  properties  we  had  in  the  infinite  case,  namely  the  preservation  of  xn, 
G  ©  xn  =  xn . 

To  simplify  the  treatment,  we  assume  now  that  it  is  enough  to  find  the  Green  function 
on  a  small  window  rather  than  for  the  whole  image.  This  window  can  than  be  moved  across 
the  image  to  find  the  whole  solution.  The  justification  is  that  the  data  outside  the  window 
does  not  influence  the  smoothed  shape  inside  very  much.  We  form  the  window  such  that 
its  center  is  located  at  the  "needle”  position  f  and  it  extends  to  a  width  w  around  this 
center.  This  can  be  done  by  setting  xo  =  f  in  eq.  (6),  and  setting  the  window  boundaries 
at  if  ±  w.  Since  the  windows  are  all  similar  the  coefficients  b ,  are  now  constants  and  eq. 
(6)  becomes  shift-invariant: 

4 

G(x  -Z)  =  Goo(x-  0  +  Y,  b'H'(x  ~  0 

1=1 

The  general  solution  of  (2)  is  now  the  linear  combination  of  windows: 

rx+w 

f(x)  -  /  G(x-f)p(f)df 

J  z  —  w 


Since  eq.  (2)  is  a  fourth  order  equation  we  have  four  independent  boundr  conditions 
that  we  can  impose.  In  line  with  our  previous  conditions  on  the  moments  we  c.  n  choose 


G(x)  =  1 


(7a) 


G(x)xn  =  0, 


n  =  1.2,3 


(7b) 


It  is  easy  to  show  that,  with  the  shift  invariance  approximation  above,  these  conditions 
lead  to  the  the  preservation  of  powers  on  a  finite  window,  in  analogy  with  the  infinite  case. 
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We  will  only  show  it  for  n  =  3.  We  have  (dropping  the  subscript  from  if): 


AI  +  U'  rX  +  W 

G®x3=/  G(x-f)f3df=/  G(x  -  f)(f  -  X  +  x)3df 

J  x  —  w  j  x  —  w 

/w 

Gtf  -  x)[(f  -  x))3  +  3(f  -  x)2x  +  3(f  -  x)x2  +  x3]d(f  -  x] 

-  W 


~  X 


The  last  equality  follows  from  the  results  for  n  <  3. 

In  practice,  it  appears  that  one  can  squeeze  one  more  condition  out  of  the  system. 
Instead  of  normalizing  the  filter  (condition  (7a))  by  adding  the  homogeneous  solutions,  the 
normalization  can  then  be  done  directly,  with  a  multiplying  constant  l/.Y.  It  amounts  to 
finding  the  smoothed  version  of  an  image  which  is  multiplied  by  the  constant  .Y.  rather 
than  the  image  itself.  In  fact,  this  kind  of  normalization  can  be  done  for  other  filters,  such 
as  a  truncated  Gaussian,  as  well.  Eq.  (7a)  can  now  be  replaced  by  some  other  condition. 
We  can  demand,  for  example,  that  the  filter  vanish  at  the  ends  of  the  window.  The  filter  is 
of  course  zero  outside  the  window,  so  this  will  provide  continuity.  Another  possibility  (not 
tried)  is  demanding  the  conservation  of  the  x4  moment,  which  will  immediately  preserve 
the  x5  moment  too,  because  of  its  antisymmetry. 

The  boundary  condition  (7)  give  rise  to  linear  equations  for  6,  which  we  will  now  solve. 
During  this  treatment  we  can  replace  (x  —  f  )/cr  by  x  and  the  half- width  of  the  window  ir 
is  replaced  by  a  normalized  width  a: 

w 

a  =  — 
a 

We  will  obtain  a  simple  linear  system  of  two  equations  that  need  to  be  solved  for  a  given 
a.  The  four  independent  homogeneous  solutions  can  be  written  as 


e±z  cos(x),  e±-csin(x) 


Condition  (7b)  with  n  =  1,  n  =  3  can  be  met  immediately  by  choosing  the  two  symmetric 
combinations 

H j(x)  =  cosh(x)  cos(x).  Hi ( x )  =  sinh(x)  sin(x) 
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The  Green  function  can  now  be  written  as 


G(x )  =  (G  oc(-r)  +  biHi(x)  +  b2H2(x))/N  (8) 

with  b\,b2,N  to  be  found  by  the  remaining  conditions.  From  condition  (7b)  with  n  =  2 
we  find  by  integration  by  parts 

x2G(x)  =  a2G(-1)(a)  -  2aG(-2)(a)  +  2G(_3)(a)  =  0 

The  negative  superscripts  mean  integration  rather  than  a  derivative.  Since  it  is  easier  to 
work  with  derivatives  we  will  use 


G(n)  =  -4G(n-4) 

(which  is  the  property  that  solves  (3))  to  obtain 

G'(a)-aG"(a)  +  a2G"'(a)/ 2  =  0  (9) 

Interestingly,  this  is  a  Taylor  expansion  of  G'{x )  around  the  point  x  =  a,  evaluated  at 
x  =  0.  and  it  is  equal  to  the  exact  value  G'(0,a)  =  0.  It  is  not  clear  if  this  fact  is  of 
importance.  Substituting  (8)  in  the  above  we  have 

G"0O(a)-aG’,4(a)  +  a2G'"(a)/2 

+bl(H[(a)  -  aH"(a)  +  a2 H"'{a)/2)  (10a) 

+b2{H'2(a)  -  aH2  (a)  +  a2H2(a)/2)  =  0 
Together  with  the  condition  of  vanishing  at  the  end 

Go c(a)  +  b\H\(a)  +  b2H2{a)  =  0  (10  b) 

we  have  a  system  of  two  linear  equations  for  b\,b2.  All  other  quantities  appearing  in  it 
are  easily  calculated  so  it  can  be  solved.  We  collect  the  coefficients  used  in  eq.  (10a)  for 
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reference: 

G'ecia)  =  —  e~a  sin(a) 

^oo(a)  =  e-a(sin(a)  —  cos(a)) 

^oo(a)  =  2e-a  cos(a) 

H[(a)  =  sinh(a)  cos(a)  —  cosh(a)  sin(a) 

H2(a)  =  sinh(a)  cos(a)  +  cosh(a)  sin(a) 

H['(a)  =  —  2  sinh(a)  sin(a) 

H2(a)  =  2 cosh(a) cos(a) 

Hi' (a)  =  — 2  sinh(a)  cos(a)  —  2  cosh(a)  sin(a) 

H'2  (a)  =  2  sinh(a)  cos(a)  —  2  cosh(a)  sin(a) 

All  that  remains  is  to  calculate  the  normalization  constant  N: 

N  =  2  (Gt‘>(a)  + 

=  1  —  e-a  cos  (a)  +  (&i  +  62)cosh(a)sin(a)  +  (&i  —  &2)sinh(a)cos(a) 

In  summary,  after  deciding  on  the  parameter  a,  i.e.  the  ratio  of  the  window  size  and 
the  smoothing  parameter  <r,  it  is  straightforward  to  calculate  the  parameters  &i ,  62,  N  and 
substitute  in  the  filters  G^{x).  The  filters  used  in  our  experiments  are  finally: 

G+(x)  =  — ^-[e~X//'T(cos(  — )  -|-  sin(-))/2  +  6i(cosh(-)cos(-)  +  &2(sinh(-)sin(  -))] 
a  iv  a  a  a  a  a  a 

G+_1)(x)  =  -^-[l-e~I/'<Tcos(-)  +  (61+62)cosh(-)sin(-)  +  (fei-62)sinh(-)cos(-)]  (11) 
2  N  a  a  a  a  a 

The  smoothing  filter  G  is  symmetric  and  vanishes  outside  the  window,  while  its  integral 

G(-1)  is  antisymmetric  and  is  equal  to  ±1/2  on  the  upper/lower  ends  of  the  window. 

Figs.  5,6  show  the  finite  smoothing  filters  G+(o:)  for  various  <rs. 

At  higher  orders,  it  may  be  easier  to  derive  the  finite  filter  by  a  polynomial  expansion. 
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5.  Rounding  the  Filter’s  Edges 

In  this  section  we  show  that  for  differentiation,  the  discontinuities  of  the  filter,  e.g.  at 
the  ends,  can  lead  to  significant  errors  even  in  the  absence  of  noise,  and  show  a  way  to 
overcome  it.  This  treatment  does  not  depend  on  the  filter  so  it  is  valid  for  any  filter. 
Truncating  an  infinite  filter  is  often  the  cause  of  these  discontinuities,  but  the  finite  filter 
developed  before  is  also  not  continuous  enough  for  use  with  derivatives. 

The  derivative  G'  can  be  used  as  a  differentiation  filter  only  when  the  smoothing  filter 
G  vanishes  at  the  ends,  because,  by  integration  by  parts. 


G'U)p{x  -  t)d£,  +  G(w){p(x  +  U7)  -  p(x  -  XV)) 


The  last  term  is  usually  neglected  when  using  a  truncated  Gaussian  filter.  Another  way 
of  looking  at  it  is  by  differentiating  the  smoothed  data: 


d_ 

dx 


()p(()d(  =  J 


—  w 


G'(x  -  OpU)d£  +  G(w){p(x  +  w)  -  p{x  -  u?)) 


The  last  term  comes  from  differentiating  the  integral’s  limits  and  is  the  same  as  in  the 
previous  expression. 

At  higher  derivatives,  more  of  these  term  will  appear,  and  our  experiments  show  that 
the  effect  of  neglecting  these  terms  becomes  totally  devastating  from  the  second  or  third 
derivative  up  (Table  2).  The  reason  is  that  truncating  the  filter  amounts  to  assuming  that 
the  data  is  zero  outside  the  window,  and  this  introduces  a  sharp  artificial  discontinuity  in 
the  data.  The  result  is  a  meaninglessly  high  values  for  derivatives,  especially  the  high  order 
ones.  A  small  discretization  interval  aggravates  the  problem.  This  “truncation  error”  is 
probably  a  major  cause  for  the  failure  of  the  usual  methods  to  obtain  reliable  derivatives. 

Directly  calculating  these  boundary  tei'ms  is  not  straightforward.  These  terms  all 
contain  values  of  the  data  and/or  its  derivatives  at  the  ends,  and  these  are  not  known 
accurately.  For  the  first  derivative  one  can  solve  the  problem  by  finding  a  filter  that  vanishes 
at  the  ends  like  the  one  found  in  the  last  section,  but  it  will  not  work  at  higher  orders.  In 
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principle,  one  can  first  smooth  the  image  and  then  perform  successive  differentiations,  but 
this  is  quite  cumbersome  and  our  experiments  with  this  idea  were  not  very  encouraging. 

Fortunately,  we  found  a  simple  “quick  fix''  to  avoid  the  extra  terms  above,  and  it  was 
very  successful  experimentally  in  most  cases.  (In  Section  6  we  present  a  more  rigorous, 
but  more  complicated  way  to  solve  the  problem.)  The  “trick”  is  to  smooth  the  filter  so 
that  it,  as  well  as  its  derivatives,  will  go  down  to  zero  continuously  around  the  ends  rather 
than  terminate  there  discontinuously.  In  this  way  all  the  extra  boundary  terms  above  will 
vanish.  The  smoothed  filter  approximates  the  discontinuous  one  pretty  closely  except  that 
its  derivatives  at  the  ends  can  now  be  handled  properly. 

This  smoothing  can  be  accomplished  by  a  spline  polynomial  interpolation  of  the  de¬ 
sired  order.  This  calculation  does  not  need  to  be  actually  done,  it  is  only  a  way  to 
understand  our  method.  The  only  modification  we  actually  do  is  replacing  the  derivatives 
of  the  truncated  filters  (which  are  meaningless  at  the  ends)  by  central  differences.  Our 
experiments  show  that  this  is  enough  to  practically  eliminate  the  truncation  error.  It 
also  provides  an  easy  way  around  other  difficulties  such  as  the  discontinuity  of  our  Green 
function  at  the  center. 

We  first  construct  a  “piecewise”  extended  smoothing  filter,  Ge(x)  so  that  it  is  iden¬ 
tically  zero  outside  a  window  of  width  w.  Its  integral  Ge~n  is  thus  extended  beyond  the 
window  as  a  constant.  We  can  define  it  on  discrete  points  with  interval  h  as 


'  G{ihY~l) 
1/2 

,-1/2 


for  —  m  <  i  <  m 
for  i  >  m 
for  i  <  —  m 


(12) 


with  m  =  w/h  being  the  farthest  point  from  the  window’s  center.  This  filter  is  equal  to 
±1/2  outside  the  window  because  of  the  normalization  of  G.  The  difference  between  these 
“piecewise”  extended  filters  and  the  original,  either  analytic  or  truncated  functions,  will 
be  crucial  at  the  ends  of  the  window. 
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In  common  smoothing  operations,  one  simply  convolves  a  continuous  smoothing  filter 
such  as  a  Gaussian  with  the  data  p(i)  using  the  discrete  convolution 

m 

f(i)  =  ^  hG(kh)p(i  —  k)  =  hG  ®  p 

k=  —  tn 

In  effect,  the  data  is  regarded  as  sharp  “spikes’’  at  points  i  with  zero  everywhere  else.  We 
will  replace  the  spikes  by  square  boxes  with  base  h  and  height  proponional  to  the  data 
value.  This  will  smooth  the  the  data  considerably,  and  it  can  be  regarded  as  a  0-th  order 
spline  interpolation.  Now  we  have  a  continuous  (“staircase")  approximation  of  the  data, 
and  one  can  use  the  piecewise  extended  filter  Ge(x)  to  smooth  over  it.  The  result  can  be 
cxprec_-d  h.  terms  of  the  extended  filter  Ge  as 

m 

m  =  Y.  (ci-1>(*  + 1/2)  -  c'-'M-  - 1/2 Mi  -  k) 

k—  —  TTX 

We  can  see  that  the  filter  G  was  replaced  by  a  new  smoothing  filter,  namely  the  first  order 
central  difference  of  its  integral  Gi-1) .  This  can  be  expressed  more  concisely  as 

f  =  DG[~l)®p  (13) 


where  D  denotes  the  central  difference 

DG(i)  =  G(i  +  1/2)  -  G(i  -  1/2) 

This  difference  can  also  be  expressed  as  a  convolution  with  the  mask  D: 

D  =  (  —  1,1) 

So,  our  new  smoothing  filter  is  now  DGi  or  D  ®  G[ 

For  taking  derivatives,  we  need  a  higher  order  interpolation.  As  mentioned  before,  the 
higher  derivatives  are  much  more  sensitive  to  the  discontinuities  effects  and  need  smoother 
interpolation.  Thus,  for  the  n-th  derivative,  we  will  interpolate  the  data  with  an  n-tli 
degree  B-spline  polynomial.  These  splines  have  n  continuous  derivatives,  so  the  error 
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resulting  from  the  discontinuity  problem  is  for  practical  purposes  eliminated,  as  confirmed 
experimentally. 

To  calculate  the  spline  derivative,  we  use  the  well-known  result  (Ortega  and  Poole. 
[1981])  that  the  n-th  order  derivative  of  an  n-th  order  spline  interpolation  is  essentially 
nothing  but  the  n-th  order  central  difference  of  the  data  points.  That  is, 

p(nn\x)  =  Dnp(i)/hn  =  Dn  ®p(i)/hn 

where  p;i(.r)  is  the  n-th  ord°r  spline  interpolation  of  p(i)  and 

Dn  =  D  ®  D  0  . . . 

The  first  few  masks  Dn  are  easily  obtained: 

D2  =  (1,  —2, 1) 

D3  =  (-1,3,  -3,1) 

=  (1,-4, 6, -4,1) 

D 5  =  (-1,5, -10,10, -5,1) 

Without  loss  of  generality  one  can  set  h  —  1.  The  transition  to  h  ^  1  can  be  made  by 
dividing  by  hn,  since 

dn  dn  _  1  dn 
dxn  d(ih)n  hn  din 

The  n-th  derivative  of  the  n-th  order  spline  approximation  is  constant  in  each  interval, 
yielding  another  staircase  function.  We  can  smooth  it  as  before  (eq.  13)  by  applying  the 
filter  With  the  convolutions’  associativity  and  commutativity  we  have: 

f(n'  -  (DQG(~1))®(Dn  ®P)  =  Dn+1  0G(e“T)  8p  =  ZT+'G*-11  Sp 

Thus,  finally,  our  n-th  order  differentiation  filter  is 

F(n)  =  Dn+iG[~1]  (15) 
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or  explicitly,  with  h  ^  1, 


m-f  n /2 

FB(fc)  =  Dn+1  ®  G(~l)/hn  =  ]T  Dnk+lG(~l)(i-k)/hn 

k—  —  m  —  n/  2 

with  n/2  being  rounded  to  the  next  lower  integer.  We  see  that  the  usual  derivatives  of 
filters  have  been  replaced  by  central  differences  of  the  same  order,  which  is  equivalent  to 
a  spline  interpolation  of  the  filter  that  smooths  its  discontinuities  up  to  that  order. 

We  emphasize  that  it  is  important  to  take  the  central  difference  of  the  "piecewise" 
extended  filter  Gi  (eq.  12)  and  not  of  the  differentiable  function  G(-1)  from  which  it 
has  originated,  particularly  at  the  ends.  This  is  what  makes  the  spline  approximation  of 
Ge  go  down  to  zero  at  the  ends. 

It  can  be  easily  shown  that  formally  eq.  (15)  can  be  generalized  to  include  combina¬ 
tions  of  differences  and  derivatives 


F(n)  =  Dn~kG[k) 

But  we  have  found  the  above  case  of  k  =  —1  to  be  the  most  useful. 

6.  General  Solution  for  Regularization 

In  previous  sections  we  found  the  solution  of  the  regularization  problem  at  the  center  of  a 
window.  To  find  the  solution  elsewhere  we  convolved  the  resulting  filter  with  the  image. 
This  has  caused  problems  in  the  derivatives  because  it  was  hard  to  differentiate  the  filter 
at  the  ends  of  the  window.  We  used  a  spline  interpolation  to  smooth  this  discontinuity. 
In  this  section  we  find  an  analytic  solution  for  the  whole  window  without  any  reference 
to  the  situation  outside.  This  is  a  more  rigorous  solution  to  the  truncation  problem  than 
the  spline  interpolation,  and  sometimes  more  accurate,  but  it  is  more  complicated.  It  can 
be  useful  when  the  smooth  part  of  the  shape  is  small  and  there  is  not  much  room  for  a 
convolution  with  a  wide  smoothing  filter,  or  when  a  large  discretization  interval  is  involved 
making  the  spline  interpolation  inaccurate.  The  boundaries  are  now  the  ends  of  the  image, 
not  of  a  moving  window. 
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The  standard  methods  of  mathematical  physics  usually  deal  with  second  order  dif¬ 
ferential  equations  with  only  non- mixed  boundary  conditions  (i.e.  the  conditions  at  one 
boundary  do  not  depend  on  the  other).  Here  we  we  have  a  fourth  order  equation  (at  least) 
with  mixed  boundary  conditions,  so  these  methods  are  not  much  guidance.  We  found  a 
general  technique  that  solves  the  problem  for  any  order  and  boundary  conditions  I  bur  we 
do  not  claim  originality). 

We  start  from  the  general  solution  of  eq.  (3): 

4 

G(x.  o  =  G^{x  -  0  +  b,(^)Hl(x)  (20) 

i=i 

The  G'-c  is  the  solution  in  the  infinite  case  and  it  provides  the  correct  jump  at  x  =  f . 


(1  +  AcdjG^U -i)  =  6(i  -O 

while  the  boundary  conditions  determine  the  coefficients  6,(£ )  of  the  homogeneous  solutions 
H,.  These  conditions  can  be  expressed  by  four  linear  combinations  Bk(G).  k  =  1  ...  4.  of 
all  the  derivatives  G'(ri)  at  the  boundaries  x  =  ±w: 

4 

Bk(G)  =  =  0 

n  =  l 

where  3n,k,3n,k  are  any  given  constants.  Substituting  G  from  eq.  (20)  in  the  above 
conditions  we  obtain  a  system  of  linear  equations  for  &,(£): 

(H\)  +  . . .  +  bi{£)B\{Hi)  =  BiiG^c) 


(21  i 


bi(£)Bi(Hi )  +  . . .  4-  b\(^)B\{H\  )  —  B\{G-yc) 


Since  the  homogeneous  solutions  H ,  do  not  depend  on  .  we  see  that  the  coefficients 
multiplying  b,(  £)  on  the  left  hand  side  are  simply  constants  and  the  only  dependence  on 
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/  rt 


comes  from  the  B^(G0 c)  on  the  right  hand  side.  Solving  the  above  system  for  6t.  we  thus 
obtain  a  linear  combination  of  the  right  hand  side  terms.  B^(G0 c): 

MO  =  ][>.*Bt(Goo(±«r.O) 

k 

Thus,  we  have  found  a  Green  function  which  satisfies  all  conditions,  it  is  easy  to  see  that 
this  is  a  generalization  of  the  standard  Green  function  method  and  can  be  applied  to  any 
order  and  boundary  conditions. 

Finally,  our  general  solution  can  be  written  as 

/U)  =  /  G(x.‘)p(  0 

J  —  U> 

Since  the  Green  function  satisfies  the  boundary  conditions  for  every  £.  so  does  the  linear 
eombination  /(j).  For  the  discrete  case,  an  integrate  approximation  such  as  Simpson's 
rule  can  be  used. 

The  derivatives  at  .r  can  be  expressed  now  as 

/(nt  =  f 

Unlike  the  filter  case,  there  are  no  extra  terms  either  from  integration  by  parts  or  moving 
boundaries,  so  the  approximation  method  used  before  to  avoid  them  is  not  needed.  At¬ 
tention  should  be  paid,  however,  to  the  situation  at  x  =  where  the  third  derivative  is 
discontinuous. 

Our  particular  boundary  conditions  resulted  from  the  preservation  of  the  first  four 
moments  ( eqs.  7).  It  is  easy  to  show  that  these  condition  can  be  written  linearly  as 

B\(G)  =  G"'\'iw  =  0 

B2{G)  =  ( G1  -xG"  if*'^  =  n 

B-i(G)  =  (G"  -  jG"')|"u.  =  0 

B\ ( G )  =  (3 G  -  3.r G'  +  x2G"’  )|^u,  =  0 

where  / j“'u,  denotes  f(ir)  —  }(  —  w).  (These  expressions  are  simpler  when  applied  to  H, 
because  of  [antisymmetry.)  These  are  the  conditions  that  we  substitute  in  eq.  (21)  for 
our  particular  case. 
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7.  Experiments 


We  compared  the  performance  of  our  filters  with  that  of  the  Gaussian-based  ones  in  various 
respects.  The  new  filter  we  used  is  given  in  eq.  (15).  with  given  by  eqs.  (11,12). 

1)  Smoothing  over  a  finite  interval.  Table  1  summarizes  the  results  for  various  powers 
x" .  at  x  ~  1.  The  window  was  wide  enough  to  eliminate  truncation  errors.  For  low 
smoothing,  a  -  0.5.  we  can  see  that  our  filter  gives  the  exact  value  of  1  up  to  x3.  is  slightly 
less  accurate  for  x1.  and  xJ  is  reduced  about  20%.  The  Gaussian's  errors  are  growing 
rapidly.  For  higher  smoothing,  n  =  2.  the  Gaussian  results  are  totally  wrong  for  powers 
x*  and  up.  in  perfect  agreement  with  the  theory.  Our  filter  gives  correct  results  up  to  x3. 
From  Section  2.  we  have  that  the  powers  are  preserved  up  to  the  order  /  of  the  filter  The 
Gaussian's  order  is  1  and  ours  here  is  3.  Section  3  generalizes  our  method  to  higher  orders. 

The  small  errors  that  we  observe  in  our  method  at  a  =  2  have  a  very  systematic 
behavior.  The  error  in  x2  is  independent  of  x  and  is  about  h2  / 12,  while  that  of  x3  was 
seen,  accordingly,  to  be  0{hzx).  This  is  exactly  the  theoretical  error  for  the  "mid-point" 
rule  of  integration  over  x2  and  is  probably  caused  by  using  the  analytic  boundary  conditions 
of  Section  3  rather  than  a  discrete  version.  In  such  a  version  the  analytic  integrals  involved 
are  replaced  by  sums. 

2.  Differentiation.  Table  2  shows  the  errors  in  differentiation  produced  by  truncating 
the  Gaussian  at  various  situations.  The  derivatives  of  the  Gaussian  were  used  as  differentia¬ 
tion  filters.  We  first  tested  at  low  smoothing  (a  =  0.2.  h  =  0.1).  so  that  the  oversmoothing 
errors  demonstrated  above  did  not  play  a  role.  At  a  window  width  of  0.0  the  Gaussian 
decayed  at  the  window's  end  to  0.01  of  its  peak  value,  yet  the  results  at  high  derivatives 
are  totally  meaningless.  Only  when  we  extended  the  window  such  that  the  decay  was  10~e> 
did  the  results  come  near  the  correct  values.  Our  filter  decays  rather  slowly  and  it  would 
be  impractical  to  achieve  this  figure.  However,  taking  the  central  difference  instead  of  the 
derivative  of  our  filter  produce  very  accurate  results  on  a  small  window  of  width  0.5.  (This 
method  should  solve  the  truncation  problem  for  the  Gaussian  too.  but  than  its  other  errors 
will  show  up. ) 
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For  stronger  smoothing,  a  =  2,  h  =  .5,  with  correspondingly  increased  window,  the 
Gaussian  results  were  much  lower  than  the  actual  derivative,  because  of  oversmoothing  at 
the  boundary.  At  a  wider  window,  the  errors  described  in  (1)  above  reappeared  and  made 
the  results  totally  off  the  mark.  With  our  method  there  is  almost  no  truncation  problem 
and  we  see  that  the  main  characteristics  are  similar  to  the  previous  case:  The  method 
gives  good  results  as  long  as  the  power  of  xn  is  no  more  than  three  above  the  order  of  the 
derivative,  consistent  with  the  preservation  of  the  powers  up  to  x3.  Some  truncation  error 
shows  up  only  at  high  derivatives  of  high  powers,  especially  at  pp-x'  ,  but  this  result  is  still 
only  20%  off,  by  far  the  worst  case  here.  We  can  see  that  this  error  is  again  proportional 
to  h2.  For  the  smaller  interval  /i,  the  same  pj-x'  was  accurate  to  1%.  The  cause  for  this 
inaccuracy  is  probably  the  fact  that  the  spline  interpolation  distorts  the  high  power  xn 
around  the  ends  more  than  it  does  the  lower  ones.  (A  method  such  as  described  in  Section 
6  which  does  not  depend  on  this  interpolation  should  not  have  this  problem.)  For  functions 
like  sin(x/so)  we  obtained  much  better  results  than  the  Gaussian  with  a  %  sq. 

3.  Random  noise.  To  see  the  effect  of  random  noise,  we  can  perturb  the  data  by  a 
given  amount  and  observe  the  effect  on  the  derivatives.  We  can  look  at  a  function  such 
as  /  =  (x/2)3,  for  which  the  scale  of  change  as  defined  in  Sec.  2  is  2.  We  know  then  that 
the  smoothing  parameter  a  should  not  be  much  greater  than  2  to  prevent  oversmoothing. 
Thus  we  look  at  a  filter  with  <7  =  2,  with  a  window  width  w  =  5  and  interval  h  =  0.5.  If  we 
perturb  the  data  at  some  pixel  i  by  the  amount  of  A/,  than  the  error  in  the  derivative  at 
the  k- th  pixel  is  Fin)(i  —  k) A /.  So,  the  error  is  proportional  to  the  magnitude  of  the  filter 

elements.  Table  3  gives  the  filter  elements  with  the  above  parameters  for  the  0 . 4-th 

derivatives.  We  can  see  that  the  values  are  on  average  0.05.  giving  an  enor  of  about  5% 
in  the  derivative  for  an  error  of  A/  =  1  in  the  data.  The  higher  derivative  filters  are  no 
worse  the  the  others  in  this  respect,  showing  that  by  choosing  the  right  parameters  high 
derivatives  are  quite  feasible. 

The  Gaussian  derivatives  are  of  the  same  magnitude,  but  their  systematic  errors  make 
them  useless  at  this  <r  as  seen  before.  The  polynomial  based  filters  tend  to  have  higher 
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values  at  the  center  and  very  low  ones  towards  the  ends,  making  them  sensitive  to  noise 
at  the  center  and  reducing  their  ability  to  average  out  the  noise. 

8.  Conclusions 

In  this  paper  we  have  found  the  sources  of  errors  in  common  differentiation  filters  and 
have  shown  ways  of  overcoming  them.  The  errors  we  dealt  with  were  (i)  Over-smoothing 
in  Gaussian-based  filters,  leading  to  inaccurate  results  even  for  the  simplest  functions.  We 
have  shown  that  higher  filters  can  solve  the  problem.  We  have  proposed  a  new  method 
of  deriving  higher  order  filters  that  achieves  a  better  balance  between  smoothing  and 
accuracy  than  previous  ones,  (ii)  Discontinuities  such  as  truncation  of  an  infinite  filter 
can  have  devastating  effects  on  the  derivatives.  We  have  overcome  this  by  replacing  the 
derivative  of  the  smoothing  filter  by  a  central  difference  of  the  same  order  to  form  the 
derivative  filter,  (iii)  Random  noise  can  be  suppressed  without  sacrificing  accuracy  too 
much,  if  the  parameters  of  the  filter  are  chosen  properly;  our  "accuracy  criterion"  gives 
the  relation  among  the  accuracy,  the  smoothing  parameter,  and  the  order  of  the  filter. 

Our  experiments  show  that  with  these  techniques  one  can  obtain  noise  resistant  high 
order  derivatives. 

References 

Besl,  P.J.  [1988],  “Robust  window  operators”.  Proceedings  of  the  Second  International 
Conference  on  Computer  Vision,  Tampa,  FL. 

Blake,  A.  and  Zisserman,  A,  [1987].  Visual  Reconstruction ,  MIT  Press.  Cambridge. 

MA. 

Dierckx,  P.  [1977],  “An  algorithm  for  least  square  fitting  of  cubic  spline  surfaces  to 
functions  on  a  rectilinear  mesh  over  a  rectangle”.  J.  Comp,  and  Appl.  Math..  3.  2. 

Grimson,  W.E.L.  [19S1],  “The  implicit  constraint  of  the  primal  sketch".  MIT  A.I.  Memo 
No.  6G3. 


27 


Haralick,  R.M.  [1984],  "Digital  step  edges  from  zero  crossings  of  second  directional 
derivatives”,  IEEE  Trans.  PA  MI-6.  5S-GS. 

Hashimoto,  M.  and  Sklansky,  J.  [19S7],  "Multi-order  derivatives  for  detecting  local 
image  characteristics”,  Computer  Vision ,  Graphics  and  Image  Processing ,  39.  28-55. 

Horn,  B.K.P.  [19S3],  "The  curve  of  least  energy”,  ACM  Transactions  on  Mathematical 
Software.  9,  441-460. 

Hueckel,  M.F.  [1973],  "A  local  visual  operator  which  recognizes  edges  and  lines”.  .7. 
Assoc.  Comp.  Mach..  20,  634-647. 

Hummel,  R.A.  [1979],  “Feature  detection  using  basis  functions”.  Computer  Graphics 
and  Image  Processing ,  9,  40-55. 

Kass,  M.,  Witkin,  A.,  and  Terzopoulos,  D.  [19S7],  "Snakes,  active  contour  models". 
Int.  J.  Computer  Vision. 

Meer,  P.  and  Weiss,  I.  [1989],  "Smoothed  differentiation  filters  for  images".  Center  for 
Automation  Research,  University  of  Maryland.  TR  424. 

Ortega,  J.M.  and  Poole,  W.G.  [1981],  An  Introduction  to  Numerical  Analysis  for 
Differential  Equations.  Pitman,  Marshfield,  MA. 

Poggio,  T.  [1987],  Proceedings  of  the  Image  Understanding  Workshop.  Los  Angeles.  C'A. 

Poggio,  T.,  Torre,  T.  and  Koch,  C.  [1985].  "Computational  vision  and  regularization 
theory".  Nature.  317.  314-319. 

Poggio,  T.,  Voorhees,  H.  and  Yuille,  A.  [19S5],  “A  regularized  solution  to  edge 
detection".  J.  of  Complexity.  4,  106-123. 

Weiss,  I.  [1988],  "3-D  shape  representation  by  contours”.  Computer  Vision.  Graphics  and 
Image  Processing.  41.  80-100. 

Weiss,  I.  [19SS],  "Projective  Invariants  of  Shapes".  Proceedings  of  the  linage  L  nderstand- 
ing  Workshop.  Cambridge.  MA. 


28 


Weiss,  I.  [1989],  “Image  Smoothing  and  Differentiation  with  Minimal  Curvature  Filters", 
Center  for  Automation  Research,  University  of  Maryland,  TR-470. 

Weiss,  I.  [1991],  “Noise  Resistant  Invariants  of  Curves”,  Invariant  Workshop ,  Iceland. 
Center  for  Automation  Research,  University  of  Maryland,  TR  537. 


29 


Table  1:  Smoothing  on  a  Finite  Interval 


s  =  .5, 

IU  =  1 

Ji  =  0.1 

1 

X 

X2 

X3 

x4 

X5 

Gauss 

1.00 

1.00 

1.25 

1.75 

2.6S 

4.44 

Ours 

1.00 

1.00 

1.00 

1.00 

0.96 

0.80 

5  =  2, 

w  =  5. 

h  =  0.5 

1 

X 

x2 

X3 

X4 

Gauss 

1.00 

1.00 

4.99 

12.99 

72.99 

Ours 

1.00 

1.00 

1.02 

1.06 

-23.27 

Table  2:  Truncation  Error 


5  =  .2,  h  =  .1 


w 

cPx2 

f/°X4 

dAxA 

dAx5 

dAx6 

dAx~ 

Trunc. 

0.6 

1.03 

1.24 

-252 

-361 

-466 

530 

Gauss 

1.0 

1.04 

1.25 

23.4 

118.9 

372.0 

935 

Ours 

.5 

1.0 

1.0 

24.0 

120.0 

361 

850 

correct 

1 

1 

24 

120 

360 

S40 

s  =  2.  h  =  .5 

w  d°x2  d°xA  dAxA  dA  x°  dAx6  dA.i 


Trunc. 

Gauss 

Ours 


0.6  4.9 
1.0  1.99 
.5  1.02 


68.6 

72.9 

-2.1.27 


0.19 
23. S 
24.0 


9.04  281 

119.2  1780.4 

120.0  397.4 


1821 
10793 
1 101.6 


.0 


Table  3:  The  Differentiation  Filters 


5  :  2,  w  —  5,  h  = 

.5 

d° 

dl 

d 2 

d 3 

d4 

0.15938313 

0.0 

-0.07757177 

0.0 

0.1673013 

0.1496S666 

-0.035587692 

-0.056659108 

0.059712945 

0.031345331 

0.12582541 

-0.056247147 

-0.027910113 

0.049725615 

-0.030269221 

0.0949S663S 

-0.064475199 

-0.0067284237 

0.035947677 

-0.023677088 

0.062465755 

-r  063716331 

0.0085339938 

0.026105913 

-0.015705673 

0.03207S371 

-0.056430985 

0.019869993 

0.020178524 

-0.0082267441 

0.00665S4S57 

-0.044101009 

0.029149307 

0.017725599 

-0.0019682933 

-0.011474073 

-0.027339632 

0.037936547 

0.017982888 

0.0024734119 

-0.020122496 

-0.0060825336 

0.04734214 

0.01990211 

-0.052793667 

-0.016935383 

0.020150092 

0.043549316 

-0.10480257 

-0.33844196 
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Figure  1:  Smoothing  Filters:  Minimal-Curvature  and 
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gtire  2:  Differentiation  Filters:  Minimal-Curvature  and  Caussian 
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